Vision and Feature Norms: Improving automatic feature norm learning through cross-modal maps

Authors

  • Luana Bulat
  • Douwe Kiela
  • Stephen Clark
Abstract

Property norms have the potential to aid a wide range of semantic tasks, provided that they can be obtained for large numbers of concepts. Recent work has focused on text as the main source of information for automatic property extraction. In this paper we examine property norm prediction from visual, rather than textual, data, using cross-modal maps learnt between property norm and visual spaces. We also investigate the importance of having a complete feature norm dataset, for both training and testing. Finally, we evaluate how these datasets and cross-modal maps can be used in an image retrieval task.
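A minimal sketch of the cross-modal mapping idea in the abstract, assuming a regularised linear map (ridge regression) from the visual space to the property-norm space; the data, dimensions, and choice of map are illustrative stand-ins, not the paper's actual setup:

# Minimal sketch of a cross-modal map from a visual space to a property-norm
# space. Synthetic data stands in for real image-derived vectors and
# McRae-style property-norm vectors (all shapes are hypothetical).
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_concepts, d_visual, d_norms = 500, 512, 256   # hypothetical dimensions

X_visual = rng.normal(size=(n_concepts, d_visual))   # image-derived vectors
Y_norms = rng.normal(size=(n_concepts, d_norms))     # property-norm vectors

# Learn a regularised linear map f: visual space -> property-norm space.
cross_modal_map = Ridge(alpha=1.0)
cross_modal_map.fit(X_visual[:400], Y_norms[:400])

# Predict property-norm vectors for held-out concepts from visual input alone.
predicted_norms = cross_modal_map.predict(X_visual[400:])
print(predicted_norms.shape)   # (100, 256)

The train/predict split is the point of the exercise: once the map is fitted, it can propose property norms for concepts that have images but no elicited norms.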

Related resources

Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models

Textual-visual cross-modal retrieval has been a hot research topic in both computer vision and natural language processing communities. Learning appropriate representations for multi-modal data is crucial for the cross-modal retrieval performance. Unlike existing image-text retrieval approaches that embed image-text pairs as single feature vectors in a common representational space, we propose ...
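As a minimal sketch of the common-space retrieval setting the excerpt describes, assuming both modalities have already been embedded into one shared space (the encoders and all vectors here are synthetic stand-ins):

# Cross-modal retrieval in a shared embedding space: rank candidate images
# for a text query by cosine similarity. A real system would produce these
# matrices with trained image and text encoders.
import numpy as np

rng = np.random.default_rng(1)
image_embeddings = rng.normal(size=(1000, 128))   # hypothetical image vectors
text_query = rng.normal(size=(128,))              # hypothetical caption vector

# L2-normalise so the dot product equals cosine similarity.
image_embeddings /= np.linalg.norm(image_embeddings, axis=1, keepdims=True)
text_query /= np.linalg.norm(text_query)

scores = image_embeddings @ text_query
top5 = np.argsort(scores)[::-1][:5]   # indices of the best-matching images
print(top5, scores[top5])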

Multi- and Cross-Modal Semantics Beyond Vision: Grounding in Auditory Perception

Multi-modal semantics has relied on feature norms or raw image data for perceptual input. In this paper we examine grounding semantic representations in raw auditory data, using standard evaluations for multi-modal semantics, including measuring conceptual similarity and relatedness. We also evaluate cross-modal mappings, through a zero-shot learning task mapping between linguistic and auditory...
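A minimal sketch of the zero-shot cross-modal evaluation mentioned in the excerpt, assuming a linear map from linguistic to auditory vectors and synthetic data in place of real embeddings:

# Fit a map from linguistic to auditory vectors on "seen" concepts, then
# check whether each held-out concept's mapped vector retrieves its own
# gold auditory vector as nearest neighbour (precision@1).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
n, d_ling, d_audio = 200, 300, 100
linguistic = rng.normal(size=(n, d_ling))
auditory = linguistic @ rng.normal(size=(d_ling, d_audio)) \
    + 0.1 * rng.normal(size=(n, d_audio))   # noisy linear relation (synthetic)

seen, unseen = slice(0, 150), slice(150, 200)
mapper = LinearRegression().fit(linguistic[seen], auditory[seen])
mapped = mapper.predict(linguistic[unseen])

gold = auditory[unseen]
gold_n = gold / np.linalg.norm(gold, axis=1, keepdims=True)
mapped_n = mapped / np.linalg.norm(mapped, axis=1, keepdims=True)
nearest = (mapped_n @ gold_n.T).argmax(axis=1)   # cosine nearest neighbour
print("P@1:", (nearest == np.arange(gold.shape[0])).mean())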

Cross-modal Sound Mapping Using Deep Learning

We present a method for automatic feature extraction and cross-modal mapping using deep learning. Our system uses stacked autoencoders to learn a layered feature representation of the data. Feature vectors from two (or more) different domains are mapped to each other, effectively creating a cross-modal mapping. Our system can either run fully unsupervised, or it can use high-level labeling to f...
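A minimal sketch of the pipeline the excerpt outlines, with single-layer autoencoders standing in for the stacked ones and synthetic sound/gesture vectors; the sizes, training schedule, and the linear code-to-code bridge are all assumptions:

# Learn a compressed code per modality with a small autoencoder, then map
# one code space to the other so input in one domain can be decoded in the
# other. Single-layer autoencoders simplify the stacked version described.
import torch
import torch.nn as nn

def autoencoder(d_in, d_code):
    enc = nn.Sequential(nn.Linear(d_in, d_code), nn.Tanh())
    dec = nn.Linear(d_code, d_in)
    return enc, dec

def train_ae(enc, dec, data, steps=200):
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(dec(enc(data)), data)  # reconstruction
        loss.backward()
        opt.step()

torch.manual_seed(0)
sound = torch.randn(256, 64)     # hypothetical audio feature vectors
gesture = torch.randn(256, 32)   # hypothetical paired control vectors

enc_s, dec_s = autoencoder(64, 16)
enc_g, dec_g = autoencoder(32, 16)
train_ae(enc_s, dec_s, sound)
train_ae(enc_g, dec_g, gesture)

# Cross-modal bridge: gesture codes -> sound codes, trained on paired data.
bridge = nn.Linear(16, 16)
opt = torch.optim.Adam(bridge.parameters(), lr=1e-3)
target = enc_s(sound).detach()
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(bridge(enc_g(gesture)), target)
    loss.backward()
    opt.step()

# A gesture can now be decoded into the sound feature space.
generated_sound = dec_s(bridge(enc_g(gesture[:1])))
print(generated_sound.shape)   # torch.Size([1, 64])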

Hesitant Fuzzy Information Measures Derived from T-norms and S-norms

In this contribution, we first introduce the concept of a metrical T-norm-based similarity measure for hesitant fuzzy sets (HFSs) by using the concept of a T-norm-based distance measure. Then, the relationship of the proposed metrical T-norm-based similarity measures with another kind of information measure, called the metrical T-norm-based entropy measure, is discussed. The main feature ...
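For background on the truncated excerpt, a standard distance-to-similarity construction for HFSs from the wider literature (the normalised Hamming distance of Xu and Xia), together with the T-norm axioms; these are not the paper's own metrical T-norm-based definitions:

% Standard HFS background, not the paper's metrical T-norm-based measures.
\[
d(A,B) \;=\; \frac{1}{n}\sum_{i=1}^{n}
  \frac{1}{l_i}\sum_{j=1}^{l_i}
  \bigl|\, h_A^{\sigma(j)}(x_i) - h_B^{\sigma(j)}(x_i) \,\bigr|,
\qquad
s(A,B) \;=\; 1 - d(A,B),
\]

where $h^{\sigma(j)}$ denotes the $j$-th largest value in a hesitant fuzzy element and $l_i$ is its (common) length. A T-norm $T$ is a commutative, associative, monotone operation on $[0,1]$ with identity $1$; standard examples are $T_{\min}(a,b)=\min(a,b)$ and the product $T_P(a,b)=ab$.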

Unsupervised Learning of Multimodal Features: Images and Text

In the following sections, we present the network architectures we use to learn bi-modal and cross-modal features. We describe an experimental setting which demonstrates that we are indeed able to learn features that effectively capture information from different modalities, and that we can further improve on computer vision features if we have other modalities (e.g. text) available during featur...
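A minimal sketch of the bi-modal idea in the excerpt, using PCA on concatenated image and text vectors as a stand-in for the described networks, with synthetic data; zero-filling the text half at test time illustrates using the extra modality only during feature learning:

# Fit a shared linear code on concatenated image+text vectors, then obtain
# features for an image-only input by zero-filling the text half. PCA is a
# simplification of the networks in the excerpt; data are synthetic.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
img = rng.normal(size=(1000, 100))
txt = img[:, :50] + 0.1 * rng.normal(size=(1000, 50))  # correlated text view

joint = PCA(n_components=32).fit(np.hstack([img, txt]))

bimodal_feats = joint.transform(np.hstack([img, txt]))            # both modalities
img_only = joint.transform(np.hstack([img, np.zeros_like(txt)]))  # image only
print(bimodal_feats.shape, img_only.shape)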

Journal:

Volume   Issue

Pages  -

Publication date: 2016